feat: track process lineage for telemetry #216

Draft
mabdinur wants to merge 7 commits into main from munir/add-runtime-id-propagation

@mabdinur
Contributor

@mabdinur mabdinur commented Apr 30, 2026

What does this PR do?

Propagates instrumentation session IDs across process boundaries so that telemetry and trace exports from child processes (spawned or forked) can be linked back to the root process in the Datadog backend.

New module telemetry_session (public API):

  • sessions_from_runtime_id(runtime_id) — builds a TelemetryInstrumentationSessions struct by reading _DD_ROOT_RS_SESSION_ID / _DD_PARENT_RS_SESSION_ID from the process environment
  • install_process_lineage_env(runtime_id) — writes _DD_ROOT_RS_SESSION_ID into the current process env on first call (idempotent, must be called before threads are spawned)
  • lineage_env_for_spawn(runtime_id) — returns the two env pairs to pass to a child process
  • extend_command_env_with_lineage(cmd, runtime_id) — convenience wrapper for std::process::Command
  • TelemetryInstrumentationSessions, ENV_ROOT_RS_SESSION_ID, ENV_PARENT_RS_SESSION_ID — re-exported from the crate root
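A minimal sketch of the lineage rule the functions above appear to implement. The PR's actual code is not shown here, so it may differ. The two environment variable names come from the description. The struct fields, the helper names (`Sessions`, `sessions_from`, `lineage_env_pairs`, `extend_command_env_sketch`) and the explicit `inherited_root` parameter are hypothetical, introduced so the logic stays pure and testable. The assumed rule is that a child inherits the root ID when one exists, otherwise the current runtime ID becomes the root, and the parent ID is always the current runtime ID.

```rust
use std::process::Command;

pub const ENV_ROOT_RS_SESSION_ID: &str = "_DD_ROOT_RS_SESSION_ID";
pub const ENV_PARENT_RS_SESSION_ID: &str = "_DD_PARENT_RS_SESSION_ID";

/// Hypothetical stand-in for TelemetryInstrumentationSessions;
/// the real field names are not given in the PR description.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Sessions {
    pub session_id: String,
    pub root_session_id: Option<String>,
    pub parent_session_id: Option<String>,
}

/// Builds the sessions for this process from its runtime ID and whatever
/// root/parent IDs were inherited (passed explicitly instead of reading
/// the process environment, so the function has no side effects).
pub fn sessions_from(
    runtime_id: &str,
    inherited_root: Option<String>,
    inherited_parent: Option<String>,
) -> Sessions {
    Sessions {
        session_id: runtime_id.to_string(),
        root_session_id: inherited_root,
        parent_session_id: inherited_parent,
    }
}

/// Returns the two env pairs a child process should receive.
/// Assumption: with no inherited root, this process is the root.
pub fn lineage_env_pairs(
    runtime_id: &str,
    inherited_root: Option<&str>,
) -> [(&'static str, String); 2] {
    let root = inherited_root.unwrap_or(runtime_id).to_string();
    [
        (ENV_ROOT_RS_SESSION_ID, root),
        (ENV_PARENT_RS_SESSION_ID, runtime_id.to_string()),
    ]
}

/// Attaches the lineage pairs to a Command before it is spawned.
pub fn extend_command_env_sketch(
    cmd: &mut Command,
    runtime_id: &str,
    inherited_root: Option<&str>,
) {
    cmd.envs(lineage_env_pairs(runtime_id, inherited_root));
}
```

In this sketch a caller would read `_DD_ROOT_RS_SESSION_ID` once with `std::env::var(...).ok()` and pass the result in. Setting variables only on the child's `Command` avoids mutating the current process environment, which is the step the description says must happen before threads are spawned.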

Wiring:

  • Config::build calls install_process_lineage_env so the root session ID is stamped into the env before any threads start
  • make_telemetry_worker passes session_id / root_session_id / parent_session_id to libdd_telemetry::config::Config
  • DatadogExporter calls set_telemetry_instrumentation_sessions on the trace exporter builder

Adapts to libdatadog v33.0.0:

  • TraceExporter<C> is now generic; added pub type TraceExporter = LibddTraceExporter<NativeCapabilities> to keep the rest of the codebase unchanged
  • wait_agent_info_ready is now async; the test-utils path wraps it in a current-thread Tokio runtime
  • Added libdd-capabilities-impl workspace dependency
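The type-alias approach in the first bullet can be illustrated with stand-in types. This is a sketch of the pattern only: `LibddTraceExporter` and `NativeCapabilities` below are local placeholders that share names with the types in the description, not the real libdatadog definitions, and the `endpoint` field and `describe` function are hypothetical.

```rust
use std::marker::PhantomData;

/// Placeholder for the upstream exporter, generic over a capabilities type.
pub struct LibddTraceExporter<C> {
    endpoint: String,
    _capabilities: PhantomData<C>,
}

/// Placeholder for the capabilities marker type.
pub struct NativeCapabilities;

impl<C> LibddTraceExporter<C> {
    pub fn new(endpoint: &str) -> Self {
        Self {
            endpoint: endpoint.to_string(),
            _capabilities: PhantomData,
        }
    }

    pub fn endpoint(&self) -> &str {
        &self.endpoint
    }
}

/// The alias fixes the type parameter in one place, so existing code that
/// names `TraceExporter` compiles without learning about the generic.
pub type TraceExporter = LibddTraceExporter<NativeCapabilities>;

/// Pre-existing call site: still written against the non-generic name.
pub fn describe(exporter: &TraceExporter) -> String {
    format!("exporting to {}", exporter.endpoint())
}
```

The trade-off is that the capabilities choice is fixed crate-wide. Code that needs a different capabilities type would have to name the generic type directly.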

Dependency pin ([patch.crates-io]): all six libdatadog crates are overridden with tag = v33.0.0 because crates.io libdd-data-pipeline 3.0.1 predates the telemetry feature and TelemetryInstrumentationSessions. The same patch is mirrored in instrumentation/Cargo.toml.

Motivation

Enables the Datadog backend to correlate telemetry and traces emitted by a parent process with those from its child processes, using the dd-session-id / dd-root-session-id / dd-parent-session-id headers already supported by libdatadog.
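As a rough illustration of how the three IDs could map onto those headers, here is a hypothetical sketch. The header names are taken from the sentence above. The `SessionIds` struct and the `session_headers` function are assumptions, as is the rule of omitting the root and parent headers when no value is known. libdatadog's actual behavior may differ.

```rust
/// Hypothetical container for the three IDs; not the libdatadog type.
pub struct SessionIds {
    pub session_id: String,
    pub root_session_id: Option<String>,
    pub parent_session_id: Option<String>,
}

/// Builds header name/value pairs for an outgoing telemetry or trace request.
/// Assumption: the session header is always sent, the other two only when set.
pub fn session_headers(ids: &SessionIds) -> Vec<(&'static str, String)> {
    let mut headers = vec![("dd-session-id", ids.session_id.clone())];
    if let Some(root) = &ids.root_session_id {
        headers.push(("dd-root-session-id", root.clone()));
    }
    if let Some(parent) = &ids.parent_session_id {
        headers.push(("dd-parent-session-id", parent.clone()));
    }
    headers
}
```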

Additional Notes

  • Parametric tests: the ddtrace-rs-client workspace in DataDog/system-tests resolves libdatadog from crates.io and has no [patch.crates-io]. A companion PR to system-tests must add the matching patch section to utils/build/docker/rust/parametric/Cargo.toml before the parametric suite can build with this branch.
  • Drop the patch once a crates.io release of libdd-data-pipeline includes the telemetry feature (libdatadog v33+).
  • Integration tests require Docker / a running test-agent and are expected to fail in CI without one.

@mabdinur
Contributor Author

Blocked by the release: DataDog/libdatadog#1822

mabdinur and others added 4 commits May 13, 2026 15:27
…-io]

The new [patch.crates-io] section in instrumentation/Cargo.toml was not
reflected in the lock file, causing `cargo build --locked` to fail in CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Point to the published crate versions from DataDog/libdatadog#1989:
  libdd-data-pipeline  3.0.1 → 3.1.0
  libdd-telemetry      4.0.0 → 5.0.0
  libdd-capabilities-impl  1.0.0 → 1.1.0

These versions include the APIs required for instrumentation session ID
propagation (TelemetryInstrumentationSessions, NativeCapabilities,
session_id fields on telemetry Config). With them published, the
[patch.crates-io] git overrides are no longer needed and are removed
from both Cargo.toml and instrumentation/Cargo.toml.

Note: this commit will not build until libdatadog#1989 merges and the
new crate versions are published to crates.io.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@mabdinur
Contributor Author

Next steps before merging

This PR previously carried [patch.crates-io] git overrides for six libdatadog crates because the APIs it depends on (TelemetryInstrumentationSessions, NativeCapabilities, session_id fields on telemetry Config) were added in DataDog/libdatadog#1822 without incrementing the crate version numbers, so they could not be consumed directly from crates.io.

The latest commit drops the patch overrides and points instead at the versions that will exist once DataDog/libdatadog#1989 merges and the crates are published:

| Crate | Before | After |
| --- | --- | --- |
| libdd-data-pipeline | 3.0.1 (git patch) | 3.1.0 |
| libdd-telemetry | 4.0.0 (git patch) | 5.0.0 |
| libdd-capabilities-impl | 1.0.0 (git patch) | 1.1.0 |

This PR will not build until the following sequence completes:

  1. DataDog/libdatadog#1989 — bumps libdd-capabilities-impl to 1.1.0 and libdd-data-pipeline to 3.1.0 (already open)
  2. libdatadog publishes libdd-capabilities-impl 1.1.0, libdd-data-pipeline 3.1.0, and libdd-telemetry 5.0.0 to crates.io
  3. Regenerate lock files (Cargo.lock and instrumentation/Cargo.lock) against the new published versions
  4. Merge this PR — no system-tests changes required once the crates are published
